Okay, then let's start, I guess. I see numbers are continuing to dwindle. We talked about
the value of information. The main thing you should remember here is basically this
formula: we determine the value of information with respect to some particular random variable
F as the expected utility of the action that maximizes expected utility given that we know
F has a particular value f, minus the expected utility we get if we don't know the value of
that random variable at all. In other words, it is simply what we expect to gain in terms of
utility from knowing that the random variable has that particular value. I know that's a
mouthful, but the general structure of the equation should be somewhat clear.
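Written out, the formula looks roughly like this (a reconstruction in common notation; the
exact symbols on the slides may differ):

    \mathrm{VPI}(F) \;=\; \sum_{f} P(F = f)\,\mathrm{EU}(\alpha_f \mid F = f) \;-\; \mathrm{EU}(\alpha),
    \quad\text{where } \alpha_f = \operatorname*{argmax}_a \mathrm{EU}(a \mid F = f)
    \ \text{ and } \ \alpha = \operatorname*{argmax}_a \mathrm{EU}(a).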
Then we talked about a couple of properties it has, in particular that VPI is not additive,
i.e., the value of knowing two random variables is in general not just the sum of the values
of the two distinct random variables. We also know it is always at least zero, so non-negative.
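In symbols (same caveat about notation):

    \mathrm{VPI}(F, G) \;\neq\; \mathrm{VPI}(F) + \mathrm{VPI}(G) \ \text{in general (non-additivity)},
    \qquad \mathrm{VPI}(F) \;\geq\; 0 \ \text{(non-negativity)}.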
Apart from that, we talked about some examples to gain a bit of intuition for when the value
of information is actually large, for some notion of large. First, if I already know that some
particular action is likely to yield the better utility, then information about the random
variables involved is worth rather little: we already know the action with utility U1 is the
better choice anyway. Second, if the utilities are very close together and have very narrow
ranges in the first place, then the amount of utility we expect to gain from actually knowing
the values of those random variables is also rather little. So the cases in which we actually
expect to gain utility from gathering information usually look like the case in the middle of
the slide: actions whose utilities overlap to some extent and have a rather broad range in the
first place, so that, say, the utility U2 of the second action can be pinned down further than
just this rather broad distribution.
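To make that concrete, here is a small toy computation (all numbers made up for illustration):
a safe action whose utility is fixed, and a risky action whose utility depends on a Boolean
variable W. Because the utility ranges overlap, knowing W has positive value:

    P_W = 0.5  # prior probability that W is true (illustrative)
    U = {("a1", True): 10.0, ("a1", False): 10.0,   # a1 is safe
         ("a2", True): 15.0, ("a2", False):  0.0}   # a2 is risky

    def eu(action):
        # Expected utility of an action when W is unknown.
        return P_W * U[(action, True)] + (1 - P_W) * U[(action, False)]

    eu_without = max(eu("a1"), eu("a2"))   # = 10.0: without information we pick a1
    eu_with = (P_W * max(U[("a1", True)], U[("a2", True)])            # best action if W is true
               + (1 - P_W) * max(U[("a1", False)], U[("a2", False)])) # best action if W is false
    print(eu_with - eu_without)            # VPI = 12.5 - 10.0 = 2.5

On average we gain 2.5 utility by observing W before choosing, which is exactly the
"overlapping, broad distributions" situation from the slide.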
We've talked about a very simple implementation of deciding when to gather information in the
first place: we basically just check all the random variables we can gain information about,
compare their value of perfect information with the cost of obtaining it, and if we find one,
the maximal one, whose value of perfect information is actually bigger than the cost of
gathering that information, then we observe it; otherwise we just pick the best action we would
have taken without gathering information. We call that particular algorithm myopic in the sense
that it only ever considers observing the single random variable that is expected to be most
worthwhile, and if there is no such variable, it just acts immediately. What this algorithm
does not do is consider potentially gathering more information about other random variables
subsequently, and so on and so forth. So it's not an ideal algorithm, but it's a good starting
point; a sketch follows below. There are strategies that are non-myopic, but we're not going
to go into detail on those here.
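A minimal sketch of such a myopic agent, assuming hypothetical callables vpi, cost, observe,
and best_action (these names are illustrative, not from the lecture):

    def myopic_step(variables, vpi, cost, observe, best_action):
        """One step of the myopic information-gathering agent: observe the
        single most worthwhile variable if its VPI exceeds its cost,
        otherwise act immediately."""
        worthwhile = [v for v in variables if vpi(v) > cost(v)]
        if worthwhile:
            best = max(worthwhile, key=lambda v: vpi(v) - cost(v))
            return observe(best)   # gather information first
        return best_action()       # act without gathering information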
Apart from that, we started talking about stochastic processes, where the assumption is that
we have a sequence of random variables indexed by some time structure, where by time structure
we in practice basically always just mean the natural numbers anyway, and these random
variables all have the same domain. So the usual way you should think about this is that we
have one random variable at each particular time step. We have a couple of notations, in
particular this one: whenever we write some random variable X with a lower index a colon b,
for, in our case, natural numbers a and b, we mean the sequence of random variables X_a,
X_{a+1}, X_{a+2}, and so on and so forth, up until X_b. And just to keep the slides from
overflowing horizontally, when we want to assign values to such a sequence, we write the
assignment as an upper index, equals e, to signify that for these particular random variables
we assume we actually know their values.
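In symbols, that notation is roughly the following (a reconstruction; the exact slide
typography may differ):

    X_{a:b} \;:=\; X_a, X_{a+1}, X_{a+2}, \dots, X_b, \qquad
    X_{a:b}^{=e} \;\text{ meaning } X_a, \dots, X_b \text{ are known to take the values } e.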
Does that make sense so far? We're going to look at a couple of examples anyway. The running
example that we're going to use is this one: we assume we are some kind of security guard in
an underground facility, where we don't actually observe the weather at all. The only possible
evidence we have for what the weather is currently doing is whether the director of that
particular facility comes in with an umbrella or doesn't. That gives us two stochastic
processes: one for whether it rains or not, and one for whether the director brings an umbrella.
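As a sketch of that setup, the two Boolean processes could be simulated like this (the
probabilities are made up for illustration; they are not values from the lecture):

    import random

    # Hypothetical, purely illustrative parameters:
    P_RAIN_GIVEN_RAIN = 0.7  # P(Rain_t | Rain_{t-1})
    P_RAIN_GIVEN_DRY  = 0.3  # P(Rain_t | not Rain_{t-1})
    P_UMB_GIVEN_RAIN  = 0.9  # P(Umbrella_t | Rain_t)
    P_UMB_GIVEN_DRY   = 0.2  # P(Umbrella_t | not Rain_t)

    def simulate(steps, rain=False):
        """Yield (rain_t, umbrella_t) pairs; the guard only ever sees umbrella_t."""
        for _ in range(steps):
            rain = random.random() < (P_RAIN_GIVEN_RAIN if rain else P_RAIN_GIVEN_DRY)
            umbrella = random.random() < (P_UMB_GIVEN_RAIN if rain else P_UMB_GIVEN_DRY)
            yield rain, umbrella

    for t, (r, u) in enumerate(simulate(5)):
        print(f"t={t}: rain={r}, umbrella={u}")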